Calculating Semantic Distance between Word Sense Probability Distributions

Authors

  • Vivian Tsang
  • Suzanne Stevenson
Abstract

Semantic similarity measures have focused on individual word senses. However, in many applications, it may be informative to compare the overall sense distributions for two different contexts. We propose a new method for comparing two probability distributions over WordNet, which captures in a single measure the aggregate semantic distance of the component nodes, weighted by their probability. Previous such measures compute only the distributional distance, and do not take into account the semantic similarity between WordNet senses across the distributions. To incorporate semantic similarity, we calculate the (dis)similarity between two probability distributions as a weighted distance “travelled” from one to the other through the WordNet hierarchy. We evaluate the measure by applying it to the acquisition of verb argument alternation knowledge, and find that overall it outperforms existing distance measures.
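The idea of a weighted distance "travelled" through a hierarchy can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain transport (earth mover's style) distance on a tree with unit-length edges, and the hierarchy `PARENT`, the node names, and the function `tree_transport_distance` are all hypothetical stand-ins for WordNet, which is larger and not strictly a tree.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent); NOT real WordNet data.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}


def tree_transport_distance(p, q, parent=PARENT):
    """Minimum total (probability mass x edges travelled) needed to turn
    distribution p into distribution q on a tree with unit-length edges.

    Assumption: every node named in p or q is a key of `parent`.
    """
    # Net surplus of probability mass at each node (p minus q).
    surplus = defaultdict(float)
    for node, mass in p.items():
        surplus[node] += mass
    for node, mass in q.items():
        surplus[node] -= mass

    def depth(node):
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    total = 0.0
    # Visit the deepest nodes first, so that a node's surplus already
    # includes its whole subtree before it is pushed up to the parent.
    for node in sorted(parent, key=depth, reverse=True):
        up = parent[node]
        if up is None:
            continue  # the root has no edge above it
        total += abs(surplus[node])  # mass that must cross this edge
        surplus[up] += surplus[node]
    return total


if __name__ == "__main__":
    # All mass moves from "dog" to its sibling "cat": two edges.
    print(tree_transport_distance({"dog": 1.0}, {"cat": 1.0}))
    # All mass moves from "dog" to the distant "car": four edges.
    print(tree_transport_distance({"dog": 1.0}, {"car": 1.0}))
```

In this sketch two distributions concentrated on sibling senses end up closer than two concentrated on senses in different branches, which a purely distributional measure that ignores the hierarchy would not distinguish.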

Similar resources

Designing Web Warehouses from XML Schemas

This paper proposes a hierarchical design framework for converting XML schemas into various data warehouse schemas based on ROLAP. In recent years, the Web has been growing incredibly and has become main information. So we are good to design a Data Servic...

Semantic Distances for Sets of Senses and Applications in Word Sense Disambiguation

There has been an increasing interest both from the Information Retrieval community and the Data Mining community in investigating possible advantages of using Word Sense Disambiguation (WSD) for enhancing semantic information in the Information Retrieval and Data Mining process. Although contradictory results have been reported, there are strong indications that the use of WSD can contribute t...

Random Walk on WordNet to Measure Lexical Semantic Relatedness

The need to determine semantic relatedness or its inverse, semantic distance, between two lexically expressed concepts is a problem that pervades much of natural language processing such as document summarization, information extraction and retrieval, word sense disambiguation and the automatic correction of word errors in text. Standard ways of measuring similarity between two words on a thesa...

Distributional measures of concept-distance: A task-oriented evaluation

We propose a framework to derive the distance between concepts from distributional measures of word co-occurrences. We use the categories in a published thesaurus as coarse-grained concepts, allowing all possible distance values to be stored in a concept–concept matrix roughly .01% the size of that created by existing measures. We show that the newly proposed concept-distance measures outperfor...

Semantic Distance Measures with Distributional Profiles of Coarse-Grained Concepts

Although semantic distance measures are applied to words in textual tasks such as building lexical chains, semantic distance is really a property of concepts, not words. After discussing the limitations of measures based solely on lexical resources such as WordNet or solely on distributional data from text corpora, we present a hybrid measure of semantic distance based on distributional profile...

Publication date: 2004